152 research outputs found
Towards Persistence-Based Reconstruction in Euclidean Spaces
Manifold reconstruction has been extensively studied for the last decade or
so, especially in two and three dimensions. Recently, significant improvements
were made in higher dimensions, leading to new methods to reconstruct large
classes of compact subsets of Euclidean space. However, the complexities
of these methods scale up exponentially with d, which makes them impractical in
medium or high dimensions, even for handling low-dimensional submanifolds. In
this paper, we introduce a novel approach that stands in-between classical
reconstruction and topological estimation, and whose complexity scales up with
the intrinsic dimension of the data. Specifically, when the data points are
sufficiently densely sampled from a smooth m-submanifold of R^d, our
method retrieves the homology of the submanifold in time at most c(m) n^5,
where n is the size of the input and c(m) is a constant depending solely on
m. It can also provably handle a wide range of compact subsets of
R^d, though with worse complexities. Along the way to proving the
correctness of our algorithm, we obtain new results on Čech, Rips, and
witness complex filtrations in Euclidean spaces.
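The homology-retrieval step can be illustrated at its simplest level, the 0-dimensional homology (connected components) of a Vietoris-Rips complex, with a union-find sketch. This toy Python version is ours for illustration, not the paper's algorithm, which relies on witness complexes and persistence:

```python
import itertools
import math

def rips_h0(points, scale):
    """Number of connected components (0-th Betti number) of the
    Vietoris-Rips complex built on `points` at the given scale."""
    parent = list(range(len(points)))

    def find(i):
        # Find the representative of i's component, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # An edge of the Rips complex joins any two points within `scale`.
    for (i, p), (j, q) in itertools.combinations(enumerate(points), 2):
        if math.dist(p, q) <= scale:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})
```

At a small scale two well-separated clusters stay distinct; at a large scale everything merges into one component.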
Rates of convergence for robust geometric inference
Distances to compact sets are widely used in the field of Topological Data
Analysis for inferring geometric and topological features from point clouds. In
this context, the distance to a probability measure (DTM) was introduced
by Chazal et al. (2011) as a robust alternative to the distance to a compact set.
In practice, the DTM can be estimated by its empirical counterpart, that is the
distance to the empirical measure (DTEM). In this paper we give a tight control
of the deviation of the DTEM. Our analysis relies on a local analysis of
empirical processes. In particular, we show that the rate of convergence of
the DTEM directly depends on the regularity at zero of a particular quantile
function that encodes local information about the geometry of the
support. This quantile function is the relevant quantity for describing precisely
how difficult a geometric inference problem is. Several numerical experiments
illustrate the convergence of the DTEM and also confirm that our bounds are
tight.
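The DTEM studied above has a simple closed form: the empirical DTM at mass parameter m is the root mean of the squared distances to the k = ceil(mn) nearest sample points. A minimal numpy sketch (the function name is ours):

```python
import numpy as np

def dtem(data, query, m):
    """Distance to the empirical measure at `query`, with mass
    parameter m in (0, 1]: root mean of the squared distances to the
    k = ceil(m * n) nearest of the n sample points."""
    n = len(data)
    k = max(1, int(np.ceil(m * n)))
    # Squared distances from the query to every sample point, sorted.
    d2 = np.sort(np.sum((data - query) ** 2, axis=1))[:k]
    return float(np.sqrt(d2.mean()))
```

Unlike the plain distance to the point cloud (the k = 1 case), averaging over a positive fraction of the sample makes the estimate robust to a few outliers.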
Data driven estimation of Laplace-Beltrami operator
Approximations of Laplace-Beltrami operators on manifolds through graph
Laplacians have become popular tools in data analysis and machine learning.
These discretized operators usually depend on bandwidth parameters whose tuning
remains a theoretical and practical problem. In this paper, we address this
problem for the unnormalized graph Laplacian by establishing an oracle
inequality that opens the door to a well-founded data-driven procedure for the
bandwidth selection. Our approach relies on recent results by Lacour and
Massart [LM15] on Lepski's method.
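The unnormalized graph Laplacian whose bandwidth is being tuned can be sketched as L = D - W for a Gaussian kernel of bandwidth h; the kernel and its normalization below are our illustrative choices, not the paper's exact setting:

```python
import numpy as np

def graph_laplacian(points, h):
    """Unnormalized graph Laplacian L = D - W with Gaussian weights
    W_ij = exp(-||x_i - x_j||^2 / h^2) (and W_ii = 0), where D is the
    diagonal degree matrix."""
    diff = points[:, None, :] - points[None, :, :]
    w = np.exp(-np.sum(diff ** 2, axis=-1) / h ** 2)
    np.fill_diagonal(w, 0.0)
    return np.diag(w.sum(axis=1)) - w
```

By construction L is symmetric, its rows sum to zero, and it is positive semi-definite; the bandwidth h controls how local the resulting discrete operator is, which is exactly the parameter the oracle inequality is meant to select.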
Focus in Ewe
In this paper, a strides detection algorithm is proposed using inertial sensors worn on the ankle. This innovative approach, based on geometric patterns, can detect both normal walking strides and atypical strides such as small steps, side steps and backward walking, which existing methods struggle to detect. It is also robust in critical situations, for example when the wearer is sitting and moving the ankle, where most algorithms in the literature would wrongly detect strides.
Optimal rates of convergence for persistence diagrams in Topological Data Analysis
Computational topology has recently seen an important development toward
data analysis, giving birth to the field of topological data analysis.
Topological persistence, or persistent homology, appears as a fundamental tool
in this field. In this paper, we study topological persistence in general
metric spaces, with a statistical approach. We show that the use of persistent
homology can be naturally considered in general statistical frameworks and
persistence diagrams can be used as statistics with interesting convergence
properties. Some numerical experiments are performed in various contexts to
illustrate our results.
The density of expected persistence diagrams and its kernel based estimation
Extended version of the SoCG proceedings, submitted to a journal. Persistence diagrams play a fundamental role in Topological Data Analysis, where they are used as topological descriptors of filtrations built on top of data. They consist of discrete multisets of points in the plane R^2 that can equivalently be seen as discrete measures on R^2. When the data come as a random point cloud, these discrete measures become random measures whose expectation is studied in this paper. First, we show that for a wide class of filtrations, including the Čech and Rips-Vietoris filtrations, the expected persistence diagram, which is a deterministic measure on R^2, has a density with respect to the Lebesgue measure. Second, building on the previous result, we show that the persistence surface recently introduced in [Adams et al., Persistence images: a stable vector representation of persistent homology] can be seen as a kernel estimator of this density. We propose a cross-validation scheme for selecting an optimal bandwidth, which is proven to be a consistent procedure to estimate the density.
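The persistence surface viewed here as a kernel estimator can be sketched as a sum of Gaussians centred at the diagram points; the grid, kernel, and optional weight below are illustrative choices, not the paper's exact construction:

```python
import numpy as np

def persistence_surface(diagram, grid_x, grid_y, bandwidth, weight=None):
    """Kernel estimate of a diagram's density on a grid: a sum of
    isotropic Gaussians centred at the (birth, death) points,
    optionally weighted (e.g. by persistence d - b)."""
    surface = np.zeros((len(grid_y), len(grid_x)))
    for (b, d) in diagram:
        w = weight(b, d) if weight else 1.0
        gx = np.exp(-(grid_x - b) ** 2 / (2 * bandwidth ** 2))
        gy = np.exp(-(grid_y - d) ** 2 / (2 * bandwidth ** 2))
        # Outer product gives the 2-D Gaussian bump on the grid.
        surface += w * np.outer(gy, gx) / (2 * np.pi * bandwidth ** 2)
    return surface
```

The bandwidth plays the same role as in any kernel density estimator, which is what makes the cross-validation scheme of the paper applicable.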
Optimal quantization of the mean measure and application to clustering of measures
This paper addresses the case where data come as point sets, or more generally as discrete measures. Our motivation is twofold: first, we intend to approximate with a compactly supported measure the mean of the measure generating process, which coincides with the intensity measure in the point process framework, or with the expected persistence diagram in the framework of persistence-based topological data analysis. To this aim we provide two algorithms that we prove almost minimax optimal. Second, we build from the estimator of the mean measure a vectorization map that sends every measure into a finite-dimensional Euclidean space, and investigate its properties through a clustering-oriented lens. In a nutshell, we show that in a mixture of measure generating processes, our technique yields a representation in R^k, for k large enough, that guarantees a good clustering of the data points with high probability. Interestingly, our results apply in the framework of persistence-based shape classification via the ATOL procedure described in \cite{Royer19}.
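The vectorization map can be illustrated by its simplest instance: given a codebook approximating the mean measure, send each measure to the fraction of its mass nearest to each center. This is an ATOL-flavoured sketch of ours; the codebook is assumed given rather than produced by the paper's quantization algorithms:

```python
import numpy as np

def vectorize(measures, centers):
    """Send each measure (a point set, one row per point) to the
    vector recording the fraction of its points whose nearest
    codebook center is each given center."""
    vecs = []
    for pts in measures:
        # Pairwise distances from every point to every center.
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        vecs.append(np.bincount(labels, minlength=len(centers)) / len(pts))
    return np.array(vecs)
```

Measures concentrated near different centers receive well-separated vectors, which is what makes a standard clustering algorithm applicable downstream.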
Stochastic Convergence of Persistence Landscapes and Silhouettes
Persistent homology is a widely used tool in Topological Data Analysis that
encodes multiscale topological information as a multi-set of points in the
plane called a persistence diagram. It is difficult to apply statistical theory
directly to a random sample of diagrams. Instead, we can summarize the
persistent homology with the persistence landscape, introduced by Bubenik,
which converts a diagram into a well-behaved real-valued function. We
investigate the statistical properties of landscapes, such as weak convergence
of the average landscapes and convergence of the bootstrap. In addition, we
introduce an alternate functional summary of persistent homology, which we call
the silhouette, and derive an analogous statistical theory.
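Both summaries have short closed forms: the k-th landscape at t is the k-th largest of the triangle functions min(t - b, d - t)_+ over the diagram points (b, d), and a silhouette is a weighted mean of those triangles. A direct numpy-free sketch (the power weight p is one common choice, not the only one):

```python
def landscape(diagram, k, t):
    """k-th persistence landscape at t: the k-th largest value of the
    triangle functions min(t - b, d - t)_+ over points (b, d)."""
    vals = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram),
                  reverse=True)
    return vals[k - 1] if k <= len(vals) else 0.0

def silhouette(diagram, t, p=1.0):
    """Power-weighted silhouette at t: the mean of the same triangle
    functions, weighted by persistence (d - b) raised to the power p."""
    w = [(d - b) ** p for b, d in diagram]
    tri = [max(0.0, min(t - b, d - t)) for b, d in diagram]
    return sum(wi * ti for wi, ti in zip(w, tri)) / sum(w)
```

Because both functions are real-valued and well-behaved, sample averages and bootstrap confidence bands apply to them directly, which is the point of the statistical theory above.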
PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
Persistence diagrams, the most common descriptors of Topological Data
Analysis, encode topological properties of data and have already proved pivotal
in many different applications of data science. However, since the (metric)
space of persistence diagrams is not a Hilbert space, they end up being difficult
inputs for most Machine Learning techniques. To address this concern, several
vectorization methods have been put forward that embed persistence diagrams
into either finite-dimensional Euclidean space or (implicit) infinite
dimensional Hilbert space with kernels. In this work, we focus on persistence
diagrams built on top of graphs. Relying on extended persistence theory and the
so-called heat kernel signature, we show how graphs can be encoded by
(extended) persistence diagrams in a provably stable way. We then propose a
general and versatile framework for learning vectorizations of persistence
diagrams, which encompasses most of the vectorization techniques used in the
literature. We finally showcase the experimental strength of our setup by
achieving competitive scores on classification tasks on real-life graph
datasets.
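The learned-vectorization framework can be caricatured in a few lines: apply a learnable point-wise transformation to every diagram point, then pool with a permutation-invariant operation. The ReLU map and pooling choices below are our illustration, not PersLay's full parameterization:

```python
import numpy as np

def perslay_like(diagram, theta, op="sum"):
    """Permutation-invariant vectorization of a diagram: apply one
    point-wise map phi(p) = relu(A p + c) to every point (b, d), then
    pool over points. theta = (A, c) are the learnable weights."""
    A, c = theta
    feats = np.maximum(0.0, diagram @ A.T + c)   # shape (n_points, out_dim)
    return feats.sum(axis=0) if op == "sum" else feats.max(axis=0)
```

The output is unchanged under any reordering of the diagram points, which is the structural property such a layer must satisfy, since a persistence diagram is an unordered multiset.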
High-Dimensional Topological Data Analysis
Modern data often come as point clouds embedded in high-dimensional Euclidean spaces, or possibly more general metric spaces. They are usually not distributed uniformly, but lie around some highly nonlinear geometric structures with nontrivial topology. Topological data analysis (TDA) is an emerging field whose goal is to provide mathematical and algorithmic tools to understand the topological and geometric structure of data. This chapter provides a short introduction to this new field through a few selected topics. The focus is deliberately put on the mathematical foundations rather than specific applications, with particular attention to stability results asserting the relevance of the topological information inferred from data.